context vector
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Vision (0.96)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models
Despite the remarkable performance of text-to-image diffusion models in image generation tasks, recent studies have shown that generated images sometimes fail to capture the intended semantic content of the text prompts, a phenomenon often called semantic misalignment. To address this, we present a novel energy-based model (EBM) framework for adaptive context control that models the posterior of context vectors. Specifically, we first formulate EBMs of latent image representations and text embeddings in each cross-attention layer of the denoising autoencoder. Then, we obtain the gradient of the log posterior of the context vectors, so that the context vectors can be updated and transferred to the subsequent cross-attention layer, thereby implicitly minimizing a nested hierarchy of energy functions. Our latent EBMs further allow zero-shot compositional generation as a linear combination of cross-attention outputs from different contexts. Through extensive experiments, we demonstrate that the proposed method is highly effective in handling various image generation tasks, including multi-concept generation, text-guided image inpainting, and real and synthetic image editing.
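The compositional idea in this abstract (a linear combination of cross-attention outputs from different contexts) can be illustrated with a minimal sketch in plain Python. It assumes the keys and values have already been projected from each context's text embeddings; the `compose` helper and its mixing coefficients are hypothetical names chosen for illustration, and the sketch does not reproduce the paper's energy-based context update.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Standard scaled dot-product cross-attention: softmax(Q K^T / sqrt(d)) V.
    # queries come from image latents; keys/values come from one text context.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def compose(outputs, coeffs):
    # Hypothetical helper: weighted sum of per-context attention outputs.
    n_rows, n_cols = len(outputs[0]), len(outputs[0][0])
    return [[sum(c * out[r][j] for c, out in zip(coeffs, outputs))
             for j in range(n_cols)]
            for r in range(n_rows)]

queries = [[1.0, 0.0]]
out_a = cross_attention(queries, [[1.0, 0.0]], [[2.0, 4.0]])  # context A
out_b = cross_attention(queries, [[0.0, 1.0]], [[0.0, 0.0]])  # context B
mixed = compose([out_a, out_b], [0.5, 0.5])
```

With a single key per context, each attention output equals that context's value vector, so `mixed` is simply the average of the two value vectors.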
Uncertainty Quantification for Deep Regression using Contextualised Normalizing Flows
Marco, Adriel Sosa, Kirwan, John Daniel, Toumpa, Alexia, Gerasimou, Simos
Quantifying uncertainty in deep regression models is important both for understanding the confidence of the model and for safe decision-making in high-risk domains. Existing approaches that yield prediction intervals overlook distributional information, neglecting the effect of multimodal or asymmetric distributions on decision-making. Similarly, full or approximated Bayesian methods, while yielding the predictive posterior density, demand major modifications to the model architecture and retraining. We introduce MCNF, a novel post hoc uncertainty quantification method that produces both prediction intervals and the full conditioned predictive distribution. MCNF operates on top of the underlying trained predictive model; thus, no predictive model retraining is needed. We provide experimental evidence that the MCNF-based uncertainty estimate is well calibrated, is competitive with state-of-the-art uncertainty quantification methods, and provides richer information for downstream decision-making tasks.
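To make the contrast between prediction intervals and a full predictive distribution concrete, here is a minimal sketch in plain Python. It assumes samples have already been drawn from some conditional predictive distribution; the helper names `central_interval` and `fraction_near` are hypothetical, and nothing here reproduces MCNF itself.

```python
import math

def central_interval(samples, coverage=0.9):
    # Conservative empirical central interval: round the lower index down
    # and the upper index up so that at least `coverage` of samples fall inside.
    xs = sorted(samples)
    n = len(xs)
    alpha = (1.0 - coverage) / 2.0
    lo = xs[int(math.floor(alpha * (n - 1)))]
    hi = xs[int(math.ceil((1.0 - alpha) * (n - 1)))]
    return lo, hi

def fraction_near(samples, center, radius):
    # Share of samples within `radius` of `center`.
    return sum(1 for s in samples if abs(s - center) <= radius) / len(samples)

# A bimodal predictive distribution: half the mass at -1, half at +1.
samples = [-1.0] * 50 + [1.0] * 50
lo, hi = central_interval(samples, coverage=0.9)
mass_in_middle = fraction_near(samples, 0.0, 0.1)
```

The interval spans both modes even though no sample lies near its midpoint, which is the kind of distributional information an interval alone cannot convey.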
- Europe > Spain (0.14)
- Europe > Middle East > Cyprus (0.04)
- Oceania > Australia > Tasmania (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Alabama (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Exponential Family Embeddings
In this paper, we develop exponential family embeddings, a class of methods that extends the idea of word embeddings to other types of high-dimensional data. As examples, we study neural data with real-valued observations, count data from a market basket analysis, and ratings data from a movie recommendation system. The main idea is to model each observation conditioned on a set of other observations. This set is called the context, and the way the context is defined is a modeling choice that depends on the problem.
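The idea of modeling each observation conditioned on its context can be sketched for count data as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a Poisson conditional whose log-rate is the inner product of an embedding vector `rho_i` with a sum of context vectors `alphas` weighted by the context observations. All names are chosen here for illustration.

```python
import math

def natural_parameter(rho_i, alphas, context_values):
    # eta_i = rho_i . sum_j (alpha_j * x_j), summing over the context of item i.
    dim = len(rho_i)
    ctx = [sum(a[k] * x for a, x in zip(alphas, context_values))
           for k in range(dim)]
    return sum(r * c for r, c in zip(rho_i, ctx))

def poisson_log_prob(x, eta):
    # log p(x | eta) for a Poisson distribution with log-rate eta.
    return x * eta - math.exp(eta) - math.lgamma(x + 1)

# Item i has a two-dimensional embedding; its context holds two other items
# with observed counts 2 and 0.
rho_i = [1.0, 0.0]
alphas = [[0.5, 0.0], [0.5, 0.0]]
eta = natural_parameter(rho_i, alphas, [2, 0])
log_p = poisson_log_prob(3, eta)
```

Changing which items appear in `context_values` changes the conditional, which reflects the abstract's point that the definition of the context is a modeling choice.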
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Implicit Federated In-context Learning For Task-Specific LLM Fine-Tuning
Li, Dongcheng, Chen, Junhan, Zhou, Aoxiang, Li, Chunpei, Xian, Youquan, Liu, Peng, Li, Xianxian
As large language models continue to develop and expand, the extensive public data they rely on faces the risk of depletion. Consequently, leveraging private data within organizations to enhance the performance of large models has emerged as a key challenge. The federated learning paradigm, combined with model fine-tuning techniques, effectively reduces the number of trainable parameters. However, the need to process high-dimensional feature spaces results in substantial overall computational overhead. To address this issue, we propose the Implicit Federated In-Context Learning (IFed-ICL) framework. IFed-ICL draws inspiration from federated learning to establish a novel distributed collaborative paradigm: by converting client-local context examples into implicit vector representations, it enables distributed collaborative computation during the inference phase and injects these representations into the model's residual streams to enhance model performance. Experiments demonstrate that our proposed method achieves outstanding performance across multiple text classification tasks. IFed-ICL avoids the extensive parameter updates required by conventional fine-tuning methods while reducing data transmission and local computation at the client level in federated learning. This enables efficient distributed context learning using local private-domain data, significantly improving model performance on specific tasks.
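The two steps named in this abstract (combining client-side context vectors, then injecting the result into a residual stream) can be sketched in plain Python. The abstract does not specify the aggregation rule or the injection formula, so this sketch assumes a weighted mean and a scaled additive update; `aggregate`, `inject`, and `scale` are hypothetical names for illustration only.

```python
def aggregate(client_vectors, weights=None):
    # Assumed aggregation: weighted mean of the clients' context vectors.
    n = len(client_vectors)
    if weights is None:
        weights = [1.0] * n  # uniform weighting by default
    total = sum(weights)
    dim = len(client_vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, client_vectors)) / total
            for k in range(dim)]

def inject(hidden_state, context_vector, scale=1.0):
    # Assumed injection: add the scaled context vector to a residual-stream state.
    return [h + scale * c for h, c in zip(hidden_state, context_vector)]

# Two clients each contribute a vector derived from their local examples.
global_vector = aggregate([[1.0, 2.0], [3.0, 4.0]])
steered = inject([0.0, 0.0], global_vector, scale=0.5)
```

Only the vectors leave each client in this sketch, not the underlying examples, and no model parameters are updated.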